-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix capi-kubeadmconfig rule for hybrid providers #1494
Conversation
expr: capi_kubeadmconfig_status_condition{type="Ready", status="False"} > 0 | ||
expr: |- | ||
( | ||
app_operator_app_info{status="not-installed", catalog=~"giantswarm|cluster|default", team!~"^$|noteam"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does that have to do with the app operator? I think the original metrics should bé capi_kubeadmconfig_status_condition{type="Ready
right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
welp, what did I do!?? I screwed the query with a broken copy/paste, and there were no unit tests to catch it! 😱
Thanks for catching it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
expr: |- | ||
( | ||
capi_kubeadmconfig_status_condition{type="Ready", status="False"} | ||
* on(cluster_id) group_left(provider) | ||
sum( | ||
label_replace( | ||
capi_cluster_info, "provider", "vsphere", "infrastructure_reference_kind", "VSphereCluster" | ||
) | ||
) by (cluster_id, provider) | ||
) > 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not an expert on PromQL so I may be miss understanding the query, but aren't we basically disabling the alert for non vsphere
clusters? We want this alert for all providers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not at all :)
What we are doing here is replace the provider label to vsphere if the cluster type is a VSphereCluster
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes more sense 😄 Thanks! I guess the same would be needed for cloud-director?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not need it for cloud-director because we do not have cloud director WCs running on cloud MCs yet.
We might need this for promox and so on i guess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I would prefer that we try to find a better solution in the long run :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good. Thank y'all for the explanations!
Towards https://github.com/giantswarm/giantswarm/issues/32587
This PR fixes multi-provider routing for
KubeadmConfigNotReady
alert.Checklist
oncall-kaas-cloud
GitHub group).